1. INTRODUCTION

Many multiplier architectures have been proposed to achieve low power and high-speed performance. In DSP systems, most applications are constrained by power dissipation, and multipliers are among their main components [1]-[5]; because multiplication dominates many high-speed operations, it plays a major role in the final design. Multiplication is mainly treated as an algorithm implemented at the structural level. Multi-dimensional multiplication is performed by systolic array multipliers, which are a sequence of processing cells arranged linearly and operated as a pipeline. During multiplication, each cell stores its intermediate information and passes it to the next pipeline level; every block of the systolic array multiplier is fixed and identical in structure. The cells of a systolic array operate simultaneously, which increases the speed of the system and reduces the processing time. Systolic array multipliers are also used for sorting and convolution.

In this paper, we develop a systolic array design with a new modified gate that decreases the delay and increases the speed of operation. First, the multiplicand and multiplier are arranged in an array structure; one bit from each is taken and multiplied, and the result is passed to the next pipeline stage, where partial-product accumulation and carry generation are performed. According to Landauer, energy is dissipated for every bit of information that is lost; the loss per bit is at least kT ln 2, where T is the absolute temperature and k is Boltzmann's constant. Charles Bennett showed that reversible logic can minimize this heat dissipation [6], [7]. Reversible design is therefore a promising direction for developing low-power, high-speed circuits. Reversible gates are structured so that the number of inputs equals the number of outputs, which improves the overall performance of the system [8]-[10]. In this paper, the systolic array multiplier is designed using reversible technology; that is, all components of the design use reversible gates to achieve the low-power targets. Many system designs are now being developed with reversible gates, but testing them is more complex, and the time to market depends on how the testing is done.
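
As a numerical illustration of the Landauer bound mentioned above, the following minimal sketch evaluates kT ln 2 in Python. The temperature of 300 K is an assumed room-temperature value, and the function name `landauer_limit` is hypothetical; neither comes from the design flow of this paper.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_limit(temperature_kelvin: float) -> float:
    """Minimum energy in joules dissipated per lost bit: k * T * ln 2."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # 300 K is an assumed room temperature, used only for illustration.
    print(f"{landauer_limit(300.0):.3e} J per bit")
```

At the assumed 300 K this evaluates to about 2.87e-21 J per lost bit, which is the dissipation that a logically reversible circuit can in principle avoid.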

In existing work, [11], [12] developed a systolic array multiplier with reversible gates and proposed a 4x4 systolic array design that calculates partial products and passes them on for carry-select generation; however, the design was only simulated with design tools, and only parts of it were verified through simulation. In [13], [14], a new level of testing using BILBO logic was proposed to find the number of faults, but it was applied only to Baugh-Wooley multiplier designs. Baugh-Wooley designs are mostly used for high-speed operations, and as the operand bit width increases, more logic is required for testing and implementation. The researchers in [15], [16] addressed fault analysis techniques for multipliers by reviewing different methodologies for converting matrix algorithms to predefined systolic array designs, and then introduced the systolic array structure originally designed by Lang and Moreno. Morghade et al. [17] verified their design by simulation; the implemented logic covered algorithms for multiplication, division, and direct multiplication. After examining various testing methods, they adopted an LFSR technique that generates random test values, applied it successfully, and then moved to shift-register designs, which increase the chip area. The researchers in [18]-[20] proposed a new approach for reducing the power consumption of an irreversible array multiplier and also used reversible logic for the systolic array multiplier; they reported good results compared with existing designs and tested with 90 nm CMOS technology. The researchers in [21]-[24] worked over GF, which has important applications in secure multiplication, and developed a systolic array multiplier over GF with a full pattern generator that uses a six-bit counter to generate the patterns required for testing; this increases the delay of the circuit, and the proposed system overcomes this delay issue. The aim of this research is to design an advanced systolic array multiplier with a new modified gate and to test it by fault injection, using BILBO logic to generate different patterns of test vectors.


2. RESEARCH METHOD 

Many low-power applications now use reversible gate designs to reduce area and power. In a reversible gate, the number of input variables equals the number of output variables [25], and power is used equally across the fan-outs, which makes these gates suitable for low-power designs. The logic used in reversible designs also reduces the quantum cost. Garbage outputs and constant inputs play an important role in reducing power dissipation: when a circuit keeps its garbage outputs, no information is discarded, so the power loss is lower. Reversible logic was selected for this project for its low power dissipation, and a reversible gate was modified for use in the full adder circuit, namely the modified Islam gate (MIG) shown in Figure 1. The modified Islam gate has 4 inputs and 4 outputs, and its outputs are used to build the full adder.
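
The equal-input, equal-output property can be checked mechanically: a gate is reversible when its truth table is a one-to-one mapping between input and output patterns of the same width. The sketch below is illustrative only. It uses the standard Toffoli and Feynman (CNOT) gates as stand-ins, because the truth table of the modified Islam gate is not reproduced in the text, and the helper name `is_reversible` is hypothetical.

```python
from itertools import product


def toffoli(a, b, c):
    """Standard Toffoli (CCNOT) gate: flips c when both controls are 1."""
    return a, b, c ^ (a & b)


def feynman(a, b):
    """Standard Feynman (CNOT) gate: flips b when a is 1."""
    return a, a ^ b


def and_gate(a, b):
    """Conventional AND gate: two inputs, one output, so information is lost."""
    return (a & b,)


def is_reversible(gate, n_inputs):
    """True if the gate maps the 2**n input patterns one-to-one onto
    output patterns of the same width."""
    outputs = [gate(*bits) for bits in product((0, 1), repeat=n_inputs)]
    same_width = all(len(out) == n_inputs for out in outputs)
    return same_width and len(set(outputs)) == 2 ** n_inputs


if __name__ == "__main__":
    print(is_reversible(toffoli, 3), is_reversible(feynman, 2),
          is_reversible(and_gate, 2))
```

The same check could be applied to any candidate gate once its output equations are written as a Python function.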

We also use a controlled operational gate (COG), shown in Figure 2, to produce the full adder carry for the carry-select block. The COG has 3 inputs and 3 outputs; its logic depends on the second and third input variables, and based on their values it produces the full adder carry output. COG reversible designs are mostly used in low-power DSP circuits in which several multipliers generate partial products that in turn feed the carry generation blocks; in our project the gate is used for the carry output.


Figure 1. Reversible modified Islam gate

Figure 2. Reversible controlled operational gate (Q = AC + A'B)

By integrating the two reversible gates above, MIG and COG, we obtain a complete adder and subtractor, shown in Figure 3, which is used for systolic array multiplication; the process is described in section 2.2. The reversible full adder plays a major role in applications such as video, medical, and many other digital systems.


S = Sum, D = Difference, C = Carry, B = Borrow
Garbage outputs are labeled g1, g2, ..., gn; a separate marker indicates 0


Figure 3. Reversible full adder /full subtractor 
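
To make the roles of constant inputs and garbage outputs concrete, the following sketch builds a reversible full adder from two Peres gates, each modeled as a Toffoli gate followed by a Feynman gate. This is a well-known textbook construction used here as an assumed stand-in; it is not the MIG/COG circuit of Figure 3, whose gate-level equations are not given in the text. A plain behavioral full subtractor is included for the difference and borrow outputs. All function names are hypothetical.

```python
from itertools import product


def toffoli(a, b, c):
    return a, b, c ^ (a & b)


def feynman(a, b):
    return a, a ^ b


def peres(a, b, c):
    """Peres gate as Toffoli then Feynman: (a, a XOR b, c XOR ab)."""
    a, b, c = toffoli(a, b, c)
    a, b = feynman(a, b)
    return a, b, c


def reversible_full_adder(a, b, cin):
    """Four lines in (a, b, cin, constant 0), four lines out
    (sum, carry, and two garbage outputs)."""
    g1, p, ab = peres(a, b, 0)        # constant input 0; p = a XOR b
    g2, s, carry = peres(p, cin, ab)  # s = a XOR b XOR cin
    return s, carry, (g1, g2)


def full_subtractor(a, b, bin_):
    """Behavioral reference for a - b - bin_ (difference, borrow)."""
    d = a ^ b ^ bin_
    borrow = ((1 - a) & b) | (bin_ & (1 - (a ^ b)))
    return d, borrow


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, reversible_full_adder(a, b, c), full_subtractor(a, b, c))
```

In this stand-in, one constant input and two garbage outputs are needed to keep the adder reversible, which is the kind of overhead that a dedicated gate aims to keep small.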


2.1. Systolic array multiplier cell block 

The systolic array multiplier cell block is the basic block of the multiplication operation. It integrates the COG and MIG sub-designs into a complete full adder block that produces the partial-product result. The gating operation uses the Toffoli gate, which is well suited to reducing the power of the circuit. The multiplier cell block takes individual bits of the multiplicand and multiplier through the Toffoli gate blocks and generates the partial products using the reversible full adder block. The cells operate as a pipeline within the systolic array multiplication model. The proposed full adder built from reversible gates generates the sum and carry. The block is instantiated many times, which reduces the amount of code through re-use; while one instance receives inputs, another is processing, and the process continues until all instances have received inputs and generated their sums and carries. We use a 4-bit multiplication, in which 16 multiplier cells produce the full result of the systolic array multiplier. All operations are pipelined and scheduled so that each block computes the value of its assigned bit and sends it to the next block. Care is needed when connecting the carry output of one block to another multiplier cell block, as shown in Figure 4: a wrong connection leads to incorrect operation. The carry must be connected according to the bit ordering (endianness), with addition proceeding from right to left, so that each carry feeds the correct neighbouring block.
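
The dataflow of the 16-cell array can be sketched behaviorally as follows. The sketch assumes that each cell forms its partial-product bit with a Toffoli gate whose target is a constant 0 and adds it to the incoming sum and carry with a full adder. It evaluates the cells sequentially and does not model pipeline timing or the reversible FA/SUB gate itself; the function names are hypothetical.

```python
def toffoli(a, b, c):
    return a, b, c ^ (a & b)


def full_adder(a, b, cin):
    s = a ^ b ^ cin
    carry = (a & b) | (cin & (a ^ b))
    return s, carry


def cell(x_bit, y_bit, sum_in, carry_in):
    """One multiplier cell: partial product via Toffoli, then full adder."""
    _, _, partial = toffoli(x_bit, y_bit, 0)  # target 0 gives x_bit AND y_bit
    return full_adder(partial, sum_in, carry_in)


def array_multiply(x, y, n=4):
    """Multiply two n-bit values with an n x n grid of cells.
    Returns (product, number of cell evaluations)."""
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> i) & 1 for i in range(n)]
    acc = [0] * (2 * n)          # running sum bits, least significant first
    cells_used = 0
    for i in range(n):           # one row per multiplier bit
        carry = 0
        for j in range(n):       # carries ripple from right (LSB) to left
            acc[i + j], carry = cell(x_bits[j], y_bits[i], acc[i + j], carry)
            cells_used += 1
        acc[i + n] = carry       # row carry-out lands one position further left
    result = sum(bit << k for k, bit in enumerate(acc))
    return result, cells_used


if __name__ == "__main__":
    value, cells = array_multiply(15, 15)
    print(value, format(value, "09b"), cells)
```

Under these assumptions, `array_multiply(15, 15)` returns the product 225 (011100001 as a 9-bit value) after 16 cell evaluations. Connecting a row carry to the wrong position in `acc` would corrupt the product, which is the wiring hazard described above.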


[Block diagram: inputs Multiplier (x0, y0) and Multiplicand (x1, y1); two Toffoli Gate blocks; Proposed FA/SUB Reversible Gate; outputs Sum (s0), carry (c0)]


Figure 4. Multiplier cell block 


2.2. System design & testing method 

The proposed systolic array multiplier design and test flow verifies the multiplier corner cases, where faults are difficult to find, compares the detected faults with those of existing designs, and evaluates the improvement in area, speed, and power. The proposed system consists of four main blocks: the DUT, the GRM, the BILBO, and a checker, as shown in Figure 5. The design under test (DUT) is the proposed systolic array design, where the multiplication takes place. The systolic array multiplier is built from reversible gates, and the proposed new gate makes it faster than the existing systolic array multiplier design. Multiple data bits are used for multiplication. The systolic array multiplier consists of 4 stages: in three of the stages, the carry generated by each multiplier cell block moves to the multiplier cell block of the next stage, whereas in stage 4 the carry moves side by side from cell to cell to generate the final result of the multiplier block. Golden reference models (GRMs) are widely used in the testing and verification of SoC designs. The GRM code can be user-defined and written in any language, but it must behave exactly like the DUT. In this project the reference model is written in VHDL so that the flow of the multiplier can be followed easily at each stage. When the BILBO generates patterns, the GRM also receives them and uses them to generate its outputs. The main aim is to inject a set of faults into the reference design and to use the BILBO logic to compare the resulting signatures, so that the injected faults can be located.


[Block diagram labels: Inputs; Design Under Test; Golden Reference Model; Fault injected; Checker; Outputs]


Figure 5. The proposed system with DUT and all required components 


The DUT circuit is placed between the BILBOs, which operate in the linear feedback shift register (LFSR) and multi-input signature register (MISR) modes. To test the SAM circuit, a 4-bit multiplier design and an 8-bit BILBO were used. The YAG gate design [26] is used to generate sum and product terms simultaneously. The input signal is always in scan mode. When the BILBO is in LFSR mode, it generates the patterns required for the multiplier; the multiplier takes these inputs and feeds its output to the BILBO, which compacts the responses into a signature in MISR mode. The signature produced by the circuit with no fault injected is called the good signature. Faults are then injected into the design, the LFSR patterns are applied again, and the resulting signature is compared with the good signature. If the two match, either the test was not performed correctly or the fault was not identified by the BILBO; if they differ, the BILBO has detected the fault. Checkers are common in verification; they are also called scoreboards, in which the data received from two different blocks are compared and checked for a match or mismatch to assess the DUT. The checkers are coded in the environment and are used to test the SAM circuit both with and without injected faults. In this project, the GRM and DUT outputs are compared and the result is stored for future use.
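
The signature flow described above can be sketched in software as follows. The sketch is a behavioral illustration under stated assumptions, not the BILBO implementation of this work: the LFSR taps (x^8 + x^6 + x^5 + x^4 + 1), the MISR feedback polynomial (x^8 + x^4 + x^3 + x^2 + 1), the seed, the split of each 8-bit pattern into two 4-bit operands, and the modelling of a stuck-at fault on a product output line are all assumed for illustration, and every function name is hypothetical.

```python
def lfsr_patterns(seed, count):
    """8-bit Fibonacci LFSR; yields `count` successive states as test patterns."""
    state = seed & 0xFF
    for _ in range(count):
        yield state
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (feedback << 7)


def misr_step(signature, response):
    """One MISR clock: shift with feedback, then XOR in the parallel response."""
    msb = signature & 0x80
    signature = (signature << 1) & 0xFF
    if msb:
        signature ^= 0x1D  # assumed feedback taps
    return signature ^ (response & 0xFF)


def multiplier(x, y, stuck_at=None):
    """4x4 multiplier model; stuck_at=(bit, value) forces one product line."""
    result = (x * y) & 0xFF
    if stuck_at is not None:
        bit, value = stuck_at
        result = (result | (1 << bit)) if value else (result & ~(1 << bit))
    return result


def signature(seed=0x11, count=255, stuck_at=None):
    """Apply LFSR patterns to the multiplier and compact the responses."""
    sig = 0
    for pattern in lfsr_patterns(seed, count):
        x, y = pattern & 0xF, pattern >> 4
        sig = misr_step(sig, multiplier(x, y, stuck_at))
    return sig


def fault_detected(stuck_at, seed=0x11, count=255):
    """A fault is detected when its signature differs from the good signature."""
    return signature(seed, count, stuck_at) != signature(seed, count)


if __name__ == "__main__":
    print(hex(signature()), fault_detected((0, 0)))
```

Because the MISR compresses many responses into 8 bits, a faulty circuit can occasionally produce the good signature (aliasing); this corresponds to the case above in which the signatures match although a fault was injected.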

When the process starts, the BILBO generates patterns in the selected mode; these patterns are carried through the environment to the reversible systolic array multiplier, which processes them as a pipelined multiplier and passes its results to the checker logic. The same process runs simultaneously in the reference model. Faults are injected from the environment, one stuck-at-0/1 fault at a time. If the result differs from the expected value and the BILBO logic reports a mismatching signature, the fault has been detected and the design can be corrected. If the BILBO reports a good signature, it has failed to verify the design, and the design should be modified according to the logic required.
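
The checker side of this flow can be illustrated with a small fault-injection campaign. The sketch below is a behavioral assumption rather than the environment of this work: it models single stuck-at faults on the eight product lines only, applies all 256 operand pairs instead of BILBO-generated patterns, and uses integer multiplication for both the golden reference model and the fault-free DUT. The function names are hypothetical.

```python
def golden_model(x, y):
    """Golden reference model: the expected product."""
    return x * y


def dut(x, y, fault=None):
    """DUT model; fault=(bit, value) forces one product line to 0 or 1."""
    result = x * y
    if fault is not None:
        bit, value = fault
        result = (result | (1 << bit)) if value else (result & ~(1 << bit))
    return result


def checker(fault):
    """Scoreboard: number of patterns where DUT and reference disagree."""
    return sum(dut(x, y, fault) != golden_model(x, y)
               for x in range(16) for y in range(16))


def fault_coverage():
    """Inject each single stuck-at fault in turn; return (detected, total)."""
    faults = [(bit, value) for bit in range(8) for value in (0, 1)]
    detected = [fault for fault in faults if checker(fault) > 0]
    return len(detected), len(faults)


if __name__ == "__main__":
    detected, total = fault_coverage()
    print(f"{detected}/{total} faults detected")
```

In this simplified model every output-line fault is observable because the exhaustive pattern set drives each product bit to both 0 and 1; internal faults in a real design are harder to excite, which is why pattern generation matters.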

Testing then continues with further fault injections, and the results are compared using the checker. Many BIST architectures have been proposed, but BILBO plays a vital role in current designs: in the SAM project it can be configured as an input pattern generator in the full environment, as shown in Figure 6, and also as an output analyzer. The mode is selected through the inputs b1 and b2. Various fault models are discussed in [27]-[29]. Compared with other techniques, BIST is more popular because of its low power and short execution time, and it allows even complex designs to be tested quickly. BILBO is called logic BIST because it uses BIST as its main component and provides the operating modes. In this project, a reversible multiplier is tested with reversible BILBO logic to find three main fault types: SAF, MSAF, and MGF. Stuck-at faults (SAF) are rare faults in which a line is fixed at zero or one, and they can be difficult to locate; multiple stuck-at faults (MSAF) are likewise rare in conventional designs; and a missing gate fault (MGF) changes the output of the design. Finding these types of faults is essential for building fault-free system designs [30], [31].
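
A missing gate fault can be modeled by deleting one gate from a reversible cascade and comparing the outputs. The sketch below uses a four-line full adder built from Toffoli and CNOT gates as an assumed stand-in circuit; it is not the proposed multiplier, and the helper names are hypothetical.

```python
from itertools import product

# Each gate is (controls, target): the target line flips when all controls are 1.
# Lines: 0 = a, 1 = b, 2 = cin, 3 = ancilla (0 in normal operation).
FULL_ADDER = [
    ((0, 1), 3),  # Toffoli: ancilla ^= a AND b
    ((0,), 1),    # CNOT:    b ^= a
    ((1, 2), 3),  # Toffoli: ancilla ^= (a XOR b) AND cin  -> carry
    ((1,), 2),    # CNOT:    cin ^= (a XOR b)              -> sum
]


def run(circuit, bits):
    """Apply the gate cascade to a tuple of line values."""
    state = list(bits)
    for controls, target in circuit:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return tuple(state)


def missing_gate_faults(circuit):
    """Yield (index, faulty circuit) for every single missing-gate fault."""
    for i in range(len(circuit)):
        yield i, circuit[:i] + circuit[i + 1:]


def detecting_patterns(circuit, faulty, n_lines=4):
    """Input patterns whose outputs differ between good and faulty circuits."""
    return [bits for bits in product((0, 1), repeat=n_lines)
            if run(circuit, bits) != run(faulty, bits)]


if __name__ == "__main__":
    for index, faulty in missing_gate_faults(FULL_ADDER):
        print(index, len(detecting_patterns(FULL_ADDER, faulty)))
```

In this stand-in circuit each single missing gate changes the output for at least one input pattern, so an exhaustive test detects it; whether a restricted pattern set does so depends on which patterns are applied.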

[Block diagram: proposed system with components: DUT, BILBO (S, NM, MISR, LFSR), overall pass/fail. Fault injections: s-a-0, s-a-1, MGF (missing gate faults)]

S: serial scan, NM: normal mode, MISR: multi-input signature register, LFSR: linear feedback shift register


Figure 6. Full environment and testing with proposed systolic array multiplier using fault injection schemes 


3. RESULTS AND DISCUSSION 

Figure 7 shows the output of the systolic array multiplier built with the new modified Islam gate; the result can be read from the figure as 1111*1111=011100001 (15*15=225 in decimal). Figure 8 shows that various patterns have been generated for the SAM circuit with correct results, for example 14*15=210 in decimal. The outputs of the internal gates of the design are shown in Figure 9.

In Figure 10, faults are injected by removing some of the gates in the design, which produces missing gate faults; here the output is not broken, because of the use of reversible logic gates. Figure 11 shows the patterns generated by the BILBO logic in LFSR mode, which produces random patterns as shown.

Figure 12 shows the comparison values generated by the BILBO logic, which identify the stuck-at faults: after faults are injected, the design is tested and its signature is compared with the existing good signature.


[Simulation waveforms. Product values shown: 001011010, 000000000, 011100001, 011010010, 011000011, 010110100, 010100101, 010010110, 010000111, 001111000, 001101001]


Figure 9. Resultant of SAM internal blocks COG and MIG gates 


Design and testing of systolic array multiplier using fault ... (Kurada Verra Bhoga Vasantha Rayudu) 


6 0 ISSN: 2722-3221 


[Simulation waveforms with fault injection; panel labels: MGF, SAF]


Figure 12. Resultant of BILBO MISR mode signature comparison 


Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=1, x2=0, x3=0, x4=1, x5=0, scan_in=1, out=1, 3100

Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=0, x2=0, x3=1, x4=0, x5=0, scan_in=0, out=0, 3200

Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=1, x2=0, x3=0, x4=0, x5=0, scan_in=0, out=0, 3300

Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=1, x3=1, x4=0, x5=0, scan_in=0, out=1, 3400

Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=1, x2=0, x3=0, x4=1, x5=0, scan_in=1, out=0, 3500

Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=0, x3=1, x4=0, x5=0, scan_in=0, out=1, 3600

Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=1, x2=0, x3=0, x4=0, x5=0, scan_in=0, out=0, 3700

Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=0, x2=1, x3=1, x4=0, x5=0, scan_in=0, out=0, 3800

Finding test vector of the resultant at stuck at 0/1 is FAILED----The output is correct at required places
x1=1, x2=0, x3=0, x4=1, x5=0, scan_in=1, out=1, 3900

Finding test vector of the resultant at stuck at 0/1 is PASSED
x1=0, x2=0, x3=1, x4=0, x5=0, scan_in=0, out=0, 4000


Tables 1 and 2 compare the conventional and proposed multipliers in terms of fault analysis, including stuck-at faults, and resource utilization. The table values were collected from the synthesis process of Xilinx ISE, targeting the Virtex FPGA family, and show an improved execution time.


Table 1. Comparison of multipliers from BILBO logic 


Fault analysis            Conventional multiplier [10]    Proposed multiplier
Good signature            200                             200
No. of faults             138                             138
No. of faults detected    130                             134
Fault coverage            96%                             97%


Table 2. Comparison of multipliers after synthesizing the design using XILINX ISE 14.7 


Local utilization         Conventional multiplier [10]    Proposed multiplier
No. of slices             76.11%                          70.2%
No. of 4-input LUTs       26%                             25%
Time delays               28.24%                          28%
Area covered              75%                             68%


4. CONCLUSION 

Compared with existing system designs, we showed that the systolic array multiplier built with the modified gate works faster, because reversible gates, which have equal numbers of inputs and outputs, process the information faster; such gates are suitable for many low-power, high-speed applications. There is considerable scope to optimize designs with new reversible gate implementations. The proposed MIG gate reduces the gate count by 10% compared with conventional designs and improves the other parameters as well. The SAM circuit was also tested efficiently for SAF and MGF faults, and the generated test patterns achieved 100% pattern coverage. Moreover, the implemented BILBO logic can be used to find various faults in various system designs. Fault coverage with BILBO logic reached 97%, higher than that of the conventional system designs. Future SoC designs or subsystems can integrate this logic to detect faulty blocks of the design.